Rough K-means Outlier Factor Based on Entropy Computation

Authors

  • Djoko Budiyanto Setyohadi
  • Azuraliza Abu Bakar
  • Zulaiha Ali Othman
Abstract

Many outlier detection methods follow the cluster-based approach, since it does not require any prior knowledge of the dataset. However, previous studies compute the outlier factor only with respect to a single point or a small cluster, reflecting how far it deviates from a common cluster. Furthermore, all objects within an outlier cluster are assumed to be similar. Intuitively, outlier objects can be grouped into outlier clusters, and the outlier factor of each object within an outlier cluster should vary gradually; it is not natural for every object in an outlier cluster to have the same outlierness. This study proposes a new outlier detection method based on a hybrid of the Rough K-Means clustering algorithm and entropy computation. We introduce an outlier degree measure, the entropy outlier factor, for cluster-based outlier detection. The proposed algorithm sequentially finds the outlier clusters and calculates the outlier factor of the objects within each outlier cluster. Each object within an outlier cluster is evaluated with a cluster-based entropy measure relative to the whole cluster. The algorithm was tested on four UCI benchmark data sets and shows superior performance, especially in detection rate.
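The abstract does not give the formulas, so the following is only a minimal sketch of the two ingredients it names, under stated assumptions. The clustering step assumes a Lingras-style rough k-means (lower and upper approximations chosen by a distance ratio), and the scoring step uses a normalized Shannon entropy of inverse-distance memberships as a stand-in for an entropy-based outlier degree. The function names, the `ratio` and `w_lower` parameters, and the scoring rule are all hypothetical; this is not the entropy outlier factor defined in the paper.

```python
import numpy as np

def rough_assign(X, C, ratio=1.3):
    """Return boolean (n, k) masks for lower and upper approximations."""
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
    dmin = d.min(axis=1, keepdims=True)
    # Upper approximation: clusters whose centroid is almost as close as the nearest one.
    upper = d <= ratio * np.maximum(dmin, 1e-12)
    # Lower approximation: objects that fall in exactly one upper approximation.
    lower = upper & (upper.sum(axis=1, keepdims=True) == 1)
    return lower, upper

def rough_kmeans(X, init_centroids, ratio=1.3, w_lower=0.7, n_iter=20):
    """Assumed Lingras-style rough k-means with fixed initial centroids."""
    X = np.asarray(X, dtype=float)
    C = np.array(init_centroids, dtype=float)
    for _ in range(n_iter):
        lower, upper = rough_assign(X, C, ratio)
        for j in range(len(C)):
            low = X[lower[:, j]]                   # certain members
            bnd = X[upper[:, j] & ~lower[:, j]]    # boundary members
            if len(low) and len(bnd):
                C[j] = w_lower * low.mean(axis=0) + (1 - w_lower) * bnd.mean(axis=0)
            elif len(low):
                C[j] = low.mean(axis=0)
            elif len(bnd):
                C[j] = bnd.mean(axis=0)
    # Recompute the masks so they match the final centroids.
    lower, upper = rough_assign(X, C, ratio)
    return C, lower, upper

def membership_entropy(X, C):
    """Hypothetical per-object score in [0, 1]; requires at least two centroids.

    Memberships are inverse distances normalized to sum to one; the score is
    their Shannon entropy divided by log2(k).
    """
    X = np.asarray(X, dtype=float)
    C = np.asarray(C, dtype=float)
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
    sim = 1.0 / (d + 1e-12)
    p = sim / sim.sum(axis=1, keepdims=True)
    return -(p * np.log2(p)).sum(axis=1) / np.log2(C.shape[0])
```

In this sketch, a cluster whose lower approximation holds only a small share of the objects could be treated as a candidate outlier cluster, and `membership_entropy` then gives each of its objects a different score instead of one shared value, which is the kind of gradation the abstract argues for.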

Similar resources

A Novel Approach for Outlier Detection using Rough Entropy

Outlier detection is an important task in data mining and its applications. An outlier is defined as a data point that differs greatly from the rest of the data according to some measure. Such data often contain useful information on abnormal behavior of the system described by patterns. In this paper, a novel method for outlier detection is proposed for inconsistent datasets. This method expl...


Performance Analysis of Entropy based methods and Clustering methods for Brain Tumor Segmentation

Brain tumor is the most deadly disease that affects human life span. To segment the brain tumor region, many segmentation techniques have emerged in image processing, such as region-based and boundary-based segmentation. In this paper, several entropy-based methods and several clustering techniques are compared and analyzed for brain tumor segmentation. Several entropies such as rough e...


A Fuzzy Clustering Approach for Missing Value Imputation with Non-Parameter Outlier Test

Missing values are a challenging issue in data mining, as information deficiency negatively affects both data quality and reliability. This paper focuses on a fuzzy clustering algorithm for missing value imputation with immunity to noisy data. The PCFKMI (Pre-Clustering based Fuzzy K-Means Imputation) method aggregates data instances to more accurate clusters for further appropriate e...


Granular computing, rough entropy and object extraction

The problem of image object extraction in the framework of rough sets and granular computing is addressed. A measure called "rough entropy of image" is defined based on the concept of image granules. Its maximization results in minimization of roughness in both object and background regions, thereby determining the threshold of partitioning. Methods of selecting the appropriate granule size a...


RODHA: Robust Outlier Detection using Hybrid Approach

The task of outlier detection is to find the small groups of data objects that are exceptional to the inherent behavior of the rest of the data. Detection of such outliers is fundamental to a variety of database and analytic tasks such as fraud detection and customer migration. There are several approaches [10] to outlier detection employed in many study areas, amongst which distance based and de...


Publication date: 2014